Training wide residual networks for deployment using a single bit for each weight
نویسنده
چکیده
For fast and energy-efficient deployment of trained deep neural networks on resource-constrained embedded hardware, each learned weight parameter should ideally be represented and stored using a single bit. Error-rates usually increase when this requirement is imposed. Here, we report large improvements in error rates on multiple datasets, for deep convolutional neural networks deployed with 1-bit-per-weight. Using wide residual networks as our main baseline, our approach simplifies existing methods that binarize weights by applying the sign function in training; we apply scaling factors for each layer with constant unlearned values equal to the layer-specific standard deviations used for initialization. For CIFAR-10, CIFAR-100 and ImageNet, and models with 1-bitper-weight requiring less than 10 MB of parameter memory, we achieve error rates of 3.9%, 18.5% and 26.0% / 8.5% (Top-1 / Top-5) respectively. We also considered MNIST, SVHN and ImageNet32, achieving 1-bit-per-weight test results of 0.27%, 1.9%, and 41.3% / 19.1% respectively. For CIFAR, our error rates halve previously reported values, and are within about 1% of our errorrates for the same network with full-precision weights. For networks that overfit, we also show significant improvements in error rate by not learning batch normalization scale and offset parameters. This applies to both full precision and 1-bit-per-weight networks. Using a warm-restart learning-rate schedule, we found that training for 1-bit-per-weight is just as fast as full-precision networks, with better accuracy than standard schedules, and achieved about 98%99% of peak performance in just 62 training epochs for CIFAR-10/100. For full training code and trained models in MATLAB, Keras and PyTorch see https://github.com/McDonnell-Lab/1-bit-per-weight/.
منابع مشابه
Training Wide Residual Networks for Deploy- Ment Using a Single Bit for Each Weight
For fast and energy-efficient deployment of trained deep neural networks on resource-constrained embedded hardware, each learned weight parameter should ideally be represented and stored using a single bit. Error-rates usually increase when this requirement is imposed. Here, we report large improvements in error rates on multiple datasets, for deep convolutional neural networks deployed with 1-...
متن کاملTraining Wide Residual Networks for Deploy- Ment Using a Single Bit for Each Weight
For fast and energy-efficient deployment of trained deep neural networks on resource-constrained embedded hardware, each learnt weight parameter should ideally be represented and stored using a single bit. Error-rates usually increase when this requirement is imposed. Here, we report methodological innovations that result in large reductions in error rates across multiple datasets for deep conv...
متن کاملUsing a Fuzzy Rule-based Algorithm to Improve Routing in MPLS Networks
Today, the use of wireless and intelligent networks are widely used in many fields such as information technology and networking. There are several types of these networks that MPLS networks are one of these types. However, in MPLS networks there are issues and problems in the design and implementation discussion, for example security, throughput, losses, power consumption and so on. Basically,...
متن کاملMulti-channel Medium Access Control Protocols for Wireless Sensor Networks: A Survey
Extensive researches on Wireless Sensor Networks (WSNs) have been performed and many techniques have been developed for the data link (MAC) layer. Most of them assume single-channel MAC protocols. In the usual dense deployment of the sensor networks, single-channel MAC protocols may be deficient because of radio collisions and limited bandwidth. Hence, using multiple channels can significantly ...
متن کاملMulti-channel Medium Access Control Protocols for Wireless Sensor Networks: A Survey
Extensive researches on Wireless Sensor Networks (WSNs) have been performed and many techniques have been developed for the data link (MAC) layer. Most of them assume single-channel MAC protocols. In the usual dense deployment of the sensor networks, single-channel MAC protocols may be deficient because of radio collisions and limited bandwidth. Hence, using multiple channels can significantly ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1802.08530 شماره
صفحات -
تاریخ انتشار 2018